Exploration server on the TEI

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Recycling Annotated Parallel Corpora for Bilingual Document Composition

Internal identifier: 000318 (Main/Exploration); previous: 000317; next: 000319

Authors: Arantza Casillas [Spain]; Joseba Abaitua [Spain]; Raquel Martinez [Spain]

RBID : ISTEX:4F515A6D637D7CE9AC72B228D713AA6632989C07

Abstract

Abstract: Parallel corpora enriched with descriptive annotations facilitate multilingual authoring development. Departing from an annotated bitext we show how SGML markup can be recycled to produce complementary language resources. On the one hand, several translation memory databases together with glossaries of proper nouns have been produced. On the other, DTDs for source and target documents have been derived and put into correspondence. This paper discusses how these resources have been automatically generated and applied to an interactive bilingual authoring system. This tool is capable of handling a substantial proportion of text both in the composition and translation of structured documents.

DOI: 10.1007/3-540-39965-8_12


The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct:series">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Recycling Annotated Parallel Corpora for Bilingual Document Composition</title>
<author>
<name sortKey="Casillas, Arantza" sort="Casillas, Arantza" uniqKey="Casillas A" first="Arantza" last="Casillas">Arantza Casillas</name>
</author>
<author>
<name sortKey="Abaitua, Joseba" sort="Abaitua, Joseba" uniqKey="Abaitua J" first="Joseba" last="Abaitua">Joseba Abaitua</name>
</author>
<author>
<name sortKey="Martinez, Raquel" sort="Martinez, Raquel" uniqKey="Martinez R" first="Raquel" last="Martinez">Raquel Martinez</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:4F515A6D637D7CE9AC72B228D713AA6632989C07</idno>
<date when="2000" year="2000">2000</date>
<idno type="doi">10.1007/3-540-39965-8_12</idno>
<idno type="url">https://api.istex.fr/document/4F515A6D637D7CE9AC72B228D713AA6632989C07/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000099</idno>
<idno type="wicri:Area/Istex/Curation">000099</idno>
<idno type="wicri:Area/Istex/Checkpoint">000266</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Checkpoint">000266</idno>
<idno type="wicri:doubleKey">0302-9743:2000:Casillas A:recycling:annotated:parallel</idno>
<idno type="wicri:Area/Main/Merge">000344</idno>
<idno type="wicri:Area/Main/Curation">000318</idno>
<idno type="wicri:Area/Main/Exploration">000318</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Recycling Annotated Parallel Corpora for Bilingual Document Composition</title>
<author>
<name sortKey="Casillas, Arantza" sort="Casillas, Arantza" uniqKey="Casillas A" first="Arantza" last="Casillas">Arantza Casillas</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Departamento de Automática, Universidad de Alcalá</wicri:regionArea>
<wicri:noRegion>Universidad de Alcalá</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Espagne</country>
</affiliation>
</author>
<author>
<name sortKey="Abaitua, Joseba" sort="Abaitua, Joseba" uniqKey="Abaitua J" first="Joseba" last="Abaitua">Joseba Abaitua</name>
<affiliation></affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Espagne</country>
</affiliation>
</author>
<author>
<name sortKey="Martinez, Raquel" sort="Martinez, Raquel" uniqKey="Martinez R" first="Raquel" last="Martinez">Raquel Martinez</name>
<affiliation wicri:level="4">
<country xml:lang="fr">Espagne</country>
<wicri:regionArea>Departamento de Sis. Informáticos y Programación, Facultad de Matemáticas, Universidad Complutense de Madrid</wicri:regionArea>
<placeName>
<settlement type="city">Madrid</settlement>
<region nuts="2" type="region">Communauté de Madrid</region>
</placeName>
<orgName type="university">Université complutense de Madrid</orgName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Espagne</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2000</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">4F515A6D637D7CE9AC72B228D713AA6632989C07</idno>
<idno type="DOI">10.1007/3-540-39965-8_12</idno>
<idno type="ChapterID">12</idno>
<idno type="ChapterID">Chap12</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: Parallel corpora enriched with descriptive annotations facilitate multilingual authoring development. Departing from an annotated bitext we show how SGML markup can be recycled to produce complementary language resources. On the one hand, several translation memory databases together with glossaries of proper nouns have been produced. On the other, DTDs for source and target documents have been derived and put into correspondence. This paper discusses how these resources have been automatically generated and applied to an interactive bilingual authoring system. This tool is capable of handling a substantial proportion of text both in the composition and translation of structured documents.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Espagne</li>
</country>
<region>
<li>Communauté de Madrid</li>
</region>
<settlement>
<li>Madrid</li>
</settlement>
<orgName>
<li>Université complutense de Madrid</li>
</orgName>
</list>
<tree>
<country name="Espagne">
<noRegion>
<name sortKey="Casillas, Arantza" sort="Casillas, Arantza" uniqKey="Casillas A" first="Arantza" last="Casillas">Arantza Casillas</name>
</noRegion>
<name sortKey="Abaitua, Joseba" sort="Abaitua, Joseba" uniqKey="Abaitua J" first="Joseba" last="Abaitua">Joseba Abaitua</name>
<name sortKey="Casillas, Arantza" sort="Casillas, Arantza" uniqKey="Casillas A" first="Arantza" last="Casillas">Arantza Casillas</name>
<name sortKey="Martinez, Raquel" sort="Martinez, Raquel" uniqKey="Martinez R" first="Raquel" last="Martinez">Raquel Martinez</name>
<name sortKey="Martinez, Raquel" sort="Martinez, Raquel" uniqKey="Martinez R" first="Raquel" last="Martinez">Raquel Martinez</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Ticri/explor/TeiVM2/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000318 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000318 | SxmlIndent | more
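
Once a record has been extracted, it can be post-processed with general-purpose tools. The following is a minimal sketch in Python, not part of Dilib, that reads a `<record>` like the one shown on this page and pulls out the title, authors, and DOI. It assumes the record text is available as a string. The `wicri:` prefix is not declared in the dump, so the sketch binds it to a placeholder namespace URI (`urn:example:wicri`, an assumption) before parsing. The function names `parse_record` and `summarize` are illustrative only, and `SAMPLE` is an abridged copy of the record.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI: an assumption, since the page declares none.
WICRI_NS = "urn:example:wicri"


def parse_record(xml_text: str) -> ET.Element:
    """Parse a <record> dump whose wicri: prefix is left undeclared."""
    # Bind the wicri: prefix on the root element so the parser accepts it.
    patched = xml_text.replace(
        "<record>", f'<record xmlns:wicri="{WICRI_NS}">', 1
    )
    return ET.fromstring(patched)


def summarize(record: ET.Element) -> dict:
    """Collect title, author names, and DOI from the TEI header."""
    file_desc = record.find("./TEI/teiHeader/fileDesc")
    title = file_desc.findtext("./titleStmt/title")
    authors = [n.text for n in file_desc.findall("./titleStmt/author/name")]
    doi = file_desc.findtext("./publicationStmt/idno[@type='doi']")
    return {"title": title, "authors": authors, "doi": doi}


# Abridged copy of the record shown on this page, for illustration.
SAMPLE = """<record>
<TEI wicri:istexFullTextTei="biblStruct:series">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Recycling Annotated Parallel Corpora for Bilingual Document Composition</title>
<author><name sortKey="Casillas, Arantza">Arantza Casillas</name></author>
<author><name sortKey="Abaitua, Joseba">Joseba Abaitua</name></author>
<author><name sortKey="Martinez, Raquel">Raquel Martinez</name></author>
</titleStmt>
<publicationStmt>
<idno type="doi">10.1007/3-540-39965-8_12</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

if __name__ == "__main__":
    print(summarize(parse_record(SAMPLE)))
```

The string patch is used only because the dump omits the namespace declaration; if the actual `HfdSelect` output already declares `wicri:`, or does not start with a bare `<record>` tag, that step would need adjusting.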

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Ticri
   |area=    TeiVM2
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:4F515A6D637D7CE9AC72B228D713AA6632989C07
   |texte=   Recycling Annotated Parallel Corpora for Bilingual Document Composition
}}

Wicri

This area was generated with Dilib version V0.6.31.
Data generation: Mon Oct 30 21:59:18 2017. Site generation: Sun Feb 11 23:16:06 2024